Linear Algebra


1. Vector

  • A vector is a list of numbers.
  • A vector is represented by an arrow.
  • A vector is an element of a vector space.

2. Vector Space

  • Linear Space
  • \((V, +, \cdot)\) over \(K\)
    • where \( V \) is a set, \( +\colon V\times V \to V \) (vector addition) and \( \cdot\colon K\times V \to V \) (scalar multiplication) are functions, and \( K \) is a field.

2.1. Definition

  • Associativity of Vector Addition
  • Commutativity of Vector Addition
  • Identity Element of Vector Addition
  • Inverse Elements of Vector Addition
  • Compatibility of Scalar Multiplication with Field Multiplication
  • Identity Element of Scalar Multiplication
  • Distributivity of Scalar Multiplication with respect to Vector Addition
  • Distributivity of Scalar Multiplication with respect to Field Addition

3. Linear Independence

3.1. Definition

3.1.1. For Finite Set of Vectors

The set of vectors \( \{ \mathbf{v}_1, \dots, \mathbf{v}_k \} \) is said to be linearly independent if \[ a_1\mathbf{v}_1 + a_2\mathbf{v}_2 + \cdots + a_k\mathbf{v}_k = \mathbf{0} \implies a_1 = a_2 = \cdots = a_k = 0 \] where \( a_i \) are the elements of the field associated with the underlying vector space.
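
A quick numerical check of this definition (a minimal numpy sketch with arbitrarily chosen vectors): a finite set of vectors is linearly independent exactly when the matrix having them as columns has rank equal to the number of vectors.

    import numpy as np

    # Columns are the candidate vectors v_1, ..., v_k (arbitrary example values).
    V = np.array([[1.0, 0.0, 2.0],
                  [0.0, 1.0, 3.0],
                  [0.0, 0.0, 0.0]])

    k = V.shape[1]
    # Linearly independent  <=>  rank equals the number of vectors.
    print(np.linalg.matrix_rank(V) == k)   # False: v_3 = 2 v_1 + 3 v_2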

3.1.2. For Infinite Set of Vectors

  • An infinite set of vectors is linearly independent if every nonempty finite subset is linearly independent.

3.2. Dimension

The dimension of a vector space \( V \) is the size of a maximal linearly independent subset, i.e. the cardinality of a basis. Formally, \[ \dim(V) = \max\left\{\, k : v_1,\dots, v_k \in V,\ \{ v_1, \dots, v_k \} \text{ linearly independent} \,\right\}. \]

3.3. Matroid

A generalization of the notion of a linearly independent subset.

4. Vector Identities

4.1. Lagrange Identity

\[ (a\times b)\cdot (c\times d) = (a\cdot c)(b\cdot d)-(b\cdot c)(a\cdot d). \]

4.2. Grassmann Identity

\[ a\times (b\times c) = (a\cdot c)b - (a\cdot b)c. \]

4.3. Jacobi Identity

\[ a\times (b\times c) + b\times (c\times a) + c\times (a\times b) = 0. \] Equivalently, the cross product satisfies the Leibniz (product) rule: \[ a\times (b\times c) = (a\times b)\times c + b\times (a\times c). \]

5. Matrix

5.1. Square Matrix

Figure 1: Taxonomy of complex matrices (Taxonomy_of_Complex_Matrices.svg)

5.1.1. Identity Matrix

  • There is no rectangular identity matrix; an identity matrix is necessarily square.

5.1.2. Elementary Matrix

5.1.3. Permutation Matrix

5.1.3.1. Properties
  • \[ P^{\rm T}P = PP^{\rm T} = I, \quad\text{i.e. } P^{-1} = P^{\rm T} \]
    • \(P^2 = I\) holds only when the permutation is an involution, e.g. a single transposition.

5.1.4. Projection Matrix

5.1.4.1. Definition
  • \[ P^2 = P \]

5.1.5. Nilpotent Matrix

\[ \exists k \in \mathbb{N}, N^k = 0. \] The smallest such \( k \) is called the index of \( N \).

5.1.6. Diagonal Matrix

5.1.6.1. Definition
  • All entries are zero, except along the main diagonal.
5.1.6.2. Properties
  • \[ D^{\rm T} = D \]

5.1.7. Triangular Matrix

5.1.8. Hessenberg Matrix

Zeros above the first superdiagonal (lower Hessenberg) or below the first subdiagonal (upper Hessenberg).

5.1.9. Definite Matrix

5.1.9.1. Definition
  • The quadratic form \(\mathbf{x}^{\rm \dagger}M\mathbf{x}\) has a definite sign for every \(\mathbf{x}\neq\mathbf{0}\) (e.g. strictly positive for a positive-definite \(M\)).
5.1.9.2. Equivalent Statements
  • \(M\) is positive-definite.
  • \(M\) is congruent with a diagonal matrix with positive real entries.
  • \(M\) is Hermitian and all its eigenvalues are real and positive.
  • \(M\) is Hermitian and all its leading principal minors are positive.
  • There exists an invertible matrix \(B\) with conjugate transpose \(B^*\) such that \(M=B^*B\).

5.1.10. Symmetric Matrix

5.1.10.1. Definition
  • \[ S = S^{\rm T} \]
5.1.10.2. Properties
  • Its eigenvectors corresponding to distinct eigenvalues are orthogonal, and an orthonormal eigenbasis can be chosen.
  • The eigendecomposition is of the form: \[ S = U\Lambda U^{-1} = U\Lambda U^{\rm T} \]
    • which is also the spectral decomposition (Section 17.3); for a positive semidefinite \(S\) it coincides with the singular value decomposition.
  • The symmetry is preserved under the operation \(M^{\rm T}(\cdot )M\).
  • The symmetry is preserved under the operation \(M^{\rm T}(\cdot )M\).
5.1.10.3. Construction
  • For any matrix \(M\):
    • \(MM^{\rm T}\)
    • \(M + M^{\rm T}\)

5.1.11. Skew-Symmetric Matrix

5.1.11.1. Definition
  • \[ S = -S^{\rm T} \]
5.1.11.2. Construction
  • For any matrix \(M\):
    • \(M-M^{\rm T}\)
  • The skew-symmetry is also preserved under the operation \(M^{\rm T}(\cdot )M\).
5.1.11.3. Symplectic Matrix
  • \(2n\times 2n\) matrix \(M\) with real entries that satisfies: \[ M^\top \Omega M = \Omega \] where \(\Omega\) is a fixed \(2n\times 2n\) nonsingular, skew-symmetric matrix.

5.1.12. Hermitian Matrix

5.1.12.1. Definition
  • \[ H = H^\dagger \]
  • It is self-adjoint with respect to the standard Hermitian inner product.
  • Note that the conjugate transpose of a complex matrix corresponds to the plain transpose of the real matrix obtained by representing each complex entry as a \(2\times 2\) real block (e.g. a 2 by 2 complex matrix becomes a 4 by 4 real matrix).
  • For a real symmetric matrix \(S\), a real skew-symmetric matrix \(A\), and real scalars \(a, b\): \[ aS + biA \] is Hermitian.

5.1.13. Orthogonal Matrix

  • It is a matrix that preserves the magnitudes of vectors.
  • The orthogonal matrices comprise rotations and reflections (and their compositions).
5.1.13.1. Definition
  • A real square matrix whose columns and rows are orthonormal vectors.
  • \[ Q^{\rm T} = Q^{-1} \]
5.1.13.2. Properties
  • \[ QQ^{\rm T} = Q^{\rm T}Q = I. \]
    • Note that this means that the dot products of the column or row vectors are given by the Kronecker delta.
  • \(Q^{\rm T}\) is also orthogonal.

5.1.14. Special Orthogonal Matrix

5.1.14.1. Definition
  • Orthogonal matrix with determinant of 1.

5.1.15. Unitary Matrix

5.1.15.1. Definition
  • \[ U^\dagger = U^{-1} \]
5.1.15.2. Properties
  • Its determinant is a complex number whose magnitude is 1, i.e. \(e^{i\theta}\).

5.1.16. Special Unitary Matrix

5.1.16.1. Definition
  • Unitary matrix whose determinant is 1.
5.1.16.2. Properties
  • It can be obtained from an \(n\times n\) unitary matrix \(U\) with \(\det U = e^{i\theta}\) by dividing out an \(n\)-th root of its determinant: \(e^{-i\theta/n}U\) is special unitary.

5.1.17. Normal Matrix

5.1.17.1. Definition
  • It commutes with its conjugate transpose
  • \[ A^\dagger A =AA^\dagger \]

5.1.18. Complex Diagonalizable Matrix

5.2. Rectangular Matrix

5.2.1. Definition

  • Matrix with \(n\) rows and \(m\) columns.

5.2.2. Properties

5.3. Minor

  • Determinant of a square submatrix.

5.3.1. n-th Minor

  • The first minors of a square matrix \( A \) are the determinants of the submatrices of \( A \) with one row and one column removed.
  • The \( n \)-th minors are those with \( n \) rows and \( n \) columns removed.

5.3.2. Principal Minor

  • The set of row indices \( I \) and the set of column indices \( J \) of the original matrix that are retained in the minor are the same: \( I = J \).

5.3.3. Leading Principal Minor

  • The determinant of a square upper-left submatrix.

5.4. Cofactor

5.4.1. Definition

\[ C_{ij} = (-1)^{i+j} M_{ij} \] where \( M_{ij} \) is the first minor with \( i\)-th row and \(j\)-th column removed.

5.4.2. Cofactor Matrix

The cofactor matrix of a matrix \( A \) is the matrix whose \( (i,j) \) entry is the cofactor \( C_{ij} \) of \( A \).

5.5. Invertible Matrix

  • Nonsingular Matrix, Nondegenerate Matrix
  • A square matrix that is not invertible is called singular or degenerate.

5.5.1. Definition

  • An \(m\times n\) matrix \(A\) over a field \(K\) is called invertible if:
    • \[ \exists B\in K^{n\times m}, AB = I_m\land BA = I_n \]
  • That is, both a left inverse and a right inverse exist and they are the same, in which case it is called the inverse matrix.

5.5.2. Properties

  • It is necessary that \(m=n\).
  • Inverse matrix can be obtained from the cofactor matrix \( C \) with: \[ A^{-1} = \frac{1}{\det A} C^{\rm T} \]
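
A minimal numpy sketch of this cofactor formula (for illustration only, not an efficient inversion method; the test matrix is an arbitrary example):

    import numpy as np

    def cofactor_matrix(A):
        """Matrix of cofactors C_ij = (-1)^(i+j) det(A with row i and column j removed)."""
        n = A.shape[0]
        C = np.empty((n, n))
        for i in range(n):
            for j in range(n):
                minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
                C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
        return C

    A = np.array([[2.0, 1.0, 0.0],
                  [1.0, 3.0, 1.0],
                  [0.0, 1.0, 2.0]])
    A_inv = cofactor_matrix(A).T / np.linalg.det(A)   # A^{-1} = C^T / det(A)
    print(np.allclose(A_inv, np.linalg.inv(A)))       # True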

5.5.3. Inverse Matrix Theorem

  • Matrix Inversion Theorem
5.5.3.1. Statement
  • For a square matrix \(A\) over a field \(K\), the following statements are equivalent:
    • \(A\) is invertible
    • \(\det A \neq 0\)
    • \(A\) has full rank
    • \(A\) is bijective
    • \(\ker A = \{0\}\)
    • Eigenvalues of \(A\) are nonzero

5.6. Pseudo Inverse Matrix

  • pinv(A) in Matlab.

5.6.1. Left Inverse

5.6.1.1. Definition
  • For an \(n\times m\) matrix \(A\) with full column rank, that is, \(\mathrm{rank}(A) = m \le n\):
  • A left inverse always exists and there are many, but the one that minimizes the least-squares error is
    • \[ A_\text{left}^{-1} = ( A^TA)^{-1}A^T \]
    • Notice that \[ A^{-1}_\text{left}A = I_m \]
    • \((A^\top A)^{-1}\) always exists, because \(A^\top A\) is an \(m\times m\) matrix with full rank.
5.6.1.2. Pseudo Inverse
  • If it is used in the opposite order: \[ AA_\text{left}^{-1} = A(A^\top A)^{-1} A^\top \]
  • It is not an identity matrix, but a projection matrix onto the column space.
  • It is used as a pseudo inverse.
5.6.1.3. Derivation
  • Least Squares Formula PROOF - YouTube
  • We want to solve \(A\mathbf{x} = \mathbf{y}\) exactly, but this is not possible when \(\mathbf{y}\) lies outside of the column space.
  • So instead, we want at least to find: \[ \operatorname*{argmin}_\mathbf{x}\Vert A\mathbf{x}-\mathbf{y}\Vert \]
  • By the Hilbert projection theorem, there exists a vector \(\hat{\mathbf{x}}\) that attains this minimum.
  • By orthogonality of the column vectors of \(A\) and the residual \(A\hat{\mathbf{x}}-\mathbf{y}\), \[ A^\top(A\hat{\mathbf{x}}-\mathbf{y})=\mathbf{0} \]
  • therefore, if \(A^\top A\) is invertible, we have \[ \hat{\mathbf{x}}=( A^\top A)^{-1}A^\top\mathbf{y} \]
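
A small numpy sketch of this least-squares formula under the stated full-column-rank assumption (the data below are randomly generated placeholders):

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((6, 3))      # tall matrix, full column rank almost surely
    y = rng.standard_normal(6)

    # Normal-equations solution  x_hat = (A^T A)^{-1} A^T y
    x_hat = np.linalg.solve(A.T @ A, A.T @ y)

    # Agrees with the built-in least-squares solver
    x_lstsq, *_ = np.linalg.lstsq(A, y, rcond=None)
    print(np.allclose(x_hat, x_lstsq))   # True

    # A (A^T A)^{-1} A^T is not the identity but the projector onto the column space
    P = A @ np.linalg.inv(A.T @ A) @ A.T
    print(np.allclose(P @ P, P))         # True: idempotent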

5.6.2. Right Inverse

5.6.2.1. Definition
  • For an \(n\times m\) matrix \(A\) with full row rank, that is, \(\mathrm{rank}(A) = n \le m\):
  • The best right inverse is \[ A_\text{right}^{-1} = A^{\top}(AA^\top)^{-1} \]
  • Notice \[ AA^{-1}_\text{right} = I_n \]
  • and the inverse of \(AA^\top\) always exists because it is an \(n\times n\) matrix with full rank.
  • It can also be used as a pseudo inverse matrix on the left.

5.6.3. Pseudo Inverse

  • Pseudo inverse matrix \(A^+\) is a matrix satisfying: \[ \forall v \in \text{rowspace}(A), A^+Av = v. \]
    • In the case of left inverse the kernel was trivial and the cokernel was nontrivial, and in the case of right inverse the kernel was nontrivial and the cokernel was trivial.
    • In general, for a matrix with nontrivial kernel and nontrivial cokernel, pseudo inverse can be calculated using the singular value decomposition: \[ A = U\Sigma V^\top \implies A^+ = V\Sigma^+ U^\top \]
      • where \(\Sigma^+\) is obtained from \(\Sigma\) by transposing its shape and taking the reciprocal of each nonzero singular value, with the zeros left intact.
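
A minimal numpy sketch of this SVD construction, checked against numpy's built-in pinv (the matrix is an arbitrary rank-deficient example):

    import numpy as np

    A = np.array([[1.0, 2.0, 3.0],
                  [2.0, 4.0, 6.0]])      # rank 1: nontrivial kernel and cokernel

    U, s, Vt = np.linalg.svd(A)

    # Reciprocate only the nonzero singular values; leave the zeros intact.
    s_plus = np.array([1.0 / x if x > 1e-12 else 0.0 for x in s])
    Sigma_plus = np.zeros((A.shape[1], A.shape[0]))   # transposed shape
    Sigma_plus[:len(s), :len(s)] = np.diag(s_plus)

    A_plus = Vt.T @ Sigma_plus @ U.T
    print(np.allclose(A_plus, np.linalg.pinv(A)))     # True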

6. Products

6.1. Dot Product

\[ \mathbf{x}\cdot \mathbf{y} := x^iy_i \]

6.2. Cross Product

6.3. Scalar Multiplication

6.4. Matrix-Vector Product

7. Associated Spaces

7.1. Row Space

  • \( \mathrm{R}(A) = \mathrm{C}(A^{\rm T}) \)
  • The span of the row vectors.

7.2. Null Space

  • \( \mathrm{N}(A) \)
  • Kernel, \( \ker(A) \)
  • The set of null vectors: \( A\mathbf{v} = \mathbf{0} \) .
  • The row space and the null space are orthogonal.
  • The dimension of the null space is called the nullity.

7.3. Column Space

  • \( \mathrm{C}(A) \)
  • The span of the column vectors.

7.4. Left Null Space

  • \( \mathrm{N}(A^{\rm T}) \)
  • Cokernel, \( \mathrm{coker}(A) \)
  • The column space and the left null space are orthogonal.

7.5. Eigenspace

  • \( \ker(A-\lambda I) \) where \( \lambda \) is one of the eigenvalues.

8. Rank

  • The dimension of the row space, which is the dimension of the column space.
  • It is also the number of nonzero singular values, i.e. the number of rank-one terms \(\sigma_i \mathbf{u}_i\mathbf{v}_i^{\rm T}\) in the singular value decomposition (Section 17.8).

9. Determinant

9.1. Properties

  • It is the volume scaling factor of an \(n\)-dimensional parallelepiped under the transformation.
  • It is a homogeneous function of degree \(n\): \(\det(cA) = c^n\det(A)\) for an \(n\times n\) matrix \(A\).

9.2. Leibniz Formula for Determinants

  • The determinant of a square matrix \(A\) is: \[ \det(A) = \sum_{\tau\in S_n}\operatorname{sgn}(\tau)\prod_{i=1}^n a_{i\tau(i)} = \sum_{\sigma\in S_n}\operatorname{sgn}(\sigma)\prod_{j=1}^n a_{\sigma(j)j} \] where \(S_n\) is the symmetric group, the group of permutations of \(n\) elements.
    • Alternatively, using the Levi-Civita symbol and the Einstein summation notation: \[ \det(A) = \varepsilon_{i_1\dots i_n} a_{1i_1} \cdots a_{ni_n}. \]

9.3. Laplace Expansion

  • Cofactor Expansion

9.3.1. Definition

  • The determinant of a square matrix \(A\) is the sum of \(a_{ij}C_{ij}\) along any row or column: \[ \det(A) = \sum_{i=1}^na_{ij}C_{ij} = \sum_{j=1}^na_{ij}C_{ij} \] where \(C_{ij}\) are the cofactors.

9.4. Properties

  • \(\det(A) = \det(A^\top)\)
  • \(\det(AB) = \det(A)\det(B)\)
  • \(\det(A) = 1/\det(A^{-1})\)
    • Geometrically, the inverse transformation scales volumes by the reciprocal factor: one grid cell of the transformed lattice has volume \(\det(A)\), so it is mapped back with factor \(1/\det(A)\).

9.5. Of Outermorphism

  • Given a linear transformation \(f: V\to V\), its outermorphism extension \(f: \wedge V \to \wedge V\) is defined to satisfy: \[ f(a\wedge b) = f(a) \wedge f(b). \]
  • Then the determinant of the outermorphism \(f\) is given by: \[ \det(f) = f(I)I^{-1} \] where \(I\) is the unit pseudoscalar.
  • Determinants in Geometric Algebra - YouTube

10. Trace

10.1. Definition

  • \[ \operatorname{tr}\mathbf{A} = \sum_{i=1}^n a_{ii} \]

10.2. Interpretation

  • Trace is the divergence of the vector field \(\mathbf{Ax}\).
    • \[ \mathrm{tr}\mathbf{A} = \nabla\cdot (\mathbf{Ax}) \]
  • The sum of the eigenvalues.

10.3. Properties

  • \(\nabla \operatorname{tr}\mathbf{X} = \mathbf{I}\)
    • Here \(\nabla\) is the gradient with respect to the matrix entries: \((\nabla\operatorname{tr}\mathbf{X})_{ij} = \partial(\operatorname{tr}\mathbf{X})/\partial x_{ij}\).
  • Basis-independent
  • \(\mathrm{tr}(a\mathbf{A} + b\mathbf{B}) = a\mathrm{tr}(\mathbf{A}) + b\mathrm{tr}(\mathbf{B})\)
  • \(\mathrm{tr}(\mathbf{AB}) = \mathrm{tr}(\mathbf{BA})\)
  • \(\mathrm{tr}(\mathbf{A}^{\mathsf{T}}) = \mathrm{tr}(\mathbf{A})\)
  • \[ \operatorname{tr}(\mathbf{A}^k) = \sum_{i=1}^n \lambda_i^k \]
  • \[ \det\left(e^{t\mathbf{A}}\right) = e^{t\operatorname{tr}\mathbf{A}} \]
    • Can be understood via the eigenvalues: \(\det(e^{t\mathbf{A}}) = \prod_i e^{t\lambda_i} = e^{t\sum_i \lambda_i}\).
  • \(\mathrm{tr}(\mathbf{A}\otimes \mathbf{B}) = \mathrm{tr}(\mathbf{A})\mathrm{tr}(\mathbf{B})\)
    • where \(\otimes\) can be taken to be the Kronecker product, or the tensor product of the linear operators.
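
A few of these identities can be checked numerically; a minimal numpy sketch with randomly generated real matrices:

    import numpy as np

    rng = np.random.default_rng(1)
    A = rng.standard_normal((3, 3))
    B = rng.standard_normal((3, 3))

    print(np.isclose(np.trace(A @ B), np.trace(B @ A)))                    # cyclic property
    print(np.isclose(np.trace(A), np.linalg.eigvals(A).sum().real))        # sum of eigenvalues
    print(np.isclose(np.trace(np.kron(A, B)), np.trace(A) * np.trace(B)))  # tr(A (x) B) = tr(A) tr(B)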

10.4. Inequalities

10.4.1. Von Neumann's Trace Inequality

  • For \(A, B \in \mathbb{C}^{n\times n}\) with ordered singular values \((\sigma_i)\) and \((\tau_i)\), \[ |\mathrm{tr}(AB)| \le \sum_{i=1}^n \sigma_i\tau_i \]
  • with equality if and only if \(A\) and \(B\) admit a simultaneous singular value decomposition (share singular vectors).

11. Cramer's Rule

  • It is valid when the system has a unique solution.

11.1. Statement

  • Given a system of \(n\) linear equations for \(n\) unknowns: \[ A\mathbf{x} = \mathbf{b} \]
  • The solution is given in terms of the matrices \(A_i\) obtained by replacing the \(i\)-th column of \(A\) by the column vector \(\mathbf{b}\): \[ x_i = \frac{\det(A_i)}{\det(A)}. \]
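
A short numpy sketch of the rule (the system is an arbitrary example with a nonzero determinant, so the solution is unique):

    import numpy as np

    A = np.array([[ 2.0,  1.0, -1.0],
                  [-3.0, -1.0,  2.0],
                  [-2.0,  1.0,  2.0]])
    b = np.array([8.0, -11.0, -3.0])

    det_A = np.linalg.det(A)
    x = np.empty(3)
    for i in range(3):
        A_i = A.copy()
        A_i[:, i] = b                    # replace the i-th column by b
        x[i] = np.linalg.det(A_i) / det_A

    print(np.allclose(x, np.linalg.solve(A, b)))   # True; x = [2, 3, -1]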

11.2. Properties

11.2.1. Inverse Matrix

  • Applying the formula to find the inverse matrix, \[ I = AA^{-1}, \]
  • one gets: \[ (A^{-1})_{ij} = \frac{\det(\mathbf{a}_1, \dots, \overbrace{\mathbf{e}_j}^\text{$i$th column}, \dots, \mathbf{a}_n)}{\det(\mathbf{a}_1,\dots, \mathbf{a}_n )} = \frac{C_{ji}}{\det(A)} \]
  • where \(C_{ji}\) is the \((j,i)\) entry of the cofactor matrix, consistent with \(A^{-1} = C^{\top}/\det(A)\).

11.3. Cramer's Rule for Rectangular Matrix

  • A generalization of Cramer's rule.
  • The ratio can be found in an underdetermined system of equations.
    • Given \[ \begin{bmatrix} 0 \\ 0\end{bmatrix} = \begin{bmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \end{bmatrix}\begin{bmatrix} x \\ y \\ z \end{bmatrix}, \]
    • The ratios are identical: \[ \frac{x}{b_1c_2 - c_1b_2} = \frac{y}{c_1a_2 - a_1c_2} = \frac{z}{a_1b_2 - b_1a_2}. \]
      • Note that the denominators are the determinants of the submatrices.
      • Also note that the ratios take the form of the symmetric equation of a line whose direction vector is the cross product of the normal vectors of the two planes.

11.3.1. Proof

  • Using homogeneous coordinates, let the two planes:

    \begin{cases} \alpha_1\tilde{x} + \alpha_2\tilde{y} + \alpha_3\tilde{z} = 0\\ \beta_1\tilde{x} + \beta_2\tilde{y} + \beta_3\tilde{z} = 0 \end{cases}
  • be two projective lines on a projective plane.
  • Note that dehomogenizing by dividing both sides by \(\tilde{z}\):

    \begin{cases} \alpha_1x+ \alpha_2y + \alpha_3 = 0\\ \beta_1x + \beta_2y + \beta_3 = 0 \end{cases}
    • with \(x = \tilde{x}/\tilde{z}, y = \tilde{y}/\tilde{z}\), \(z = \tilde{z}/\tilde{z} = 1\), given that \(\tilde{z}\neq 0\).
    • Or just add a restriction to the homogeneous coordinates, since they have an extra degree of freedom: \(\tilde{z} = 1\).
  • The affine transformation matrix is \[ \begin{bmatrix} 0 \\ 0\\1\end{bmatrix} = \begin{bmatrix} \alpha_1 & \alpha_2 & \alpha_3 \\ \beta_1 & \beta_2 & \beta_3\\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}, \]
  • By Cramer's rule,
  • \[ x = \frac{\begin{vmatrix} 0 & \alpha_2 & \alpha_3 \\ 0 & \beta_2 & \beta_3\\ 1 & 0 & 1 \end{vmatrix}}{\begin{vmatrix} \alpha_1 & \alpha_2 & \alpha_3 \\ \beta_1 & \beta_2 & \beta_3\\ 0 & 0 & 1 \end{vmatrix}} = \frac{\alpha_2\beta_3 - \alpha_3\beta_2}{\alpha_1\beta_2 - \alpha_2\beta_1}, \quad y = \frac{\begin{vmatrix} \alpha_1 & 0 & \alpha_3 \\ \beta_1 & 0 & \beta_3\\ 0 & 1 & 1 \end{vmatrix}}{\begin{vmatrix} \alpha_1 & \alpha_2 & \alpha_3 \\ \beta_1 & \beta_2 & \beta_3\\ 0 & 0 & 1 \end{vmatrix}} = \frac{\alpha_3\beta_1 - \alpha_1\beta_3}{\alpha_1\beta_2 - \alpha_2\beta_1} \]
  • Finally, by substituting the homogeneous coordinate back: \[ \frac{\tilde{x}}{\alpha_2\beta_3 - \alpha_3\beta_2} = \frac{\tilde{y}}{\alpha_3\beta_1 - \alpha_1\beta_3} = \frac{\tilde{z}}{\alpha_1\beta_2 - \alpha_2\beta_1}.\ \square \]

12. Matrix Operations

12.1. Gaussian Elimination

12.1.1. Elementary Row Operations

  • Swap: Swap two rows
    • Determinant gets multiplied by \(-1\)
  • Scale: Scalar multiply a row by a nonzero constant
    • Determinant gets multiplied by that scalar
  • Pivot: Add scalar multiple of a row to another row
    • Determinant does not change

12.1.2. Row Echelon Form

  • It is defined to be the result of Gaussian elimination.
  • Lower left-hand corner of the matrix is filled with zeros as much as possible.
  • It is an upper triangular matrix with the leading coefficients being the pivots.
12.1.2.1. Reduced Row Echelon Form
  • The leading coefficients are all ones, and each column containing a leading coefficient is zero elsewhere.
  • It is unique.
  • It is of the form after appropriate column permutation: \[ \begin{bmatrix} I & X \\ 0 & 0 \end{bmatrix}. \]

12.1.3. Gauss-Jordan Elimination

  • Process of reducing to the reduced row echelon form.
  • If the matrix is invertible, then this would reduce the matrix into the identity matrix.
  • This can be used to find the inverse of a matrix by augmenting it with an identity matrix.
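
A compact numpy sketch of this augmented-matrix inversion (with partial pivoting added for numerical stability; the test matrix is arbitrary):

    import numpy as np

    def gauss_jordan_inverse(A):
        """Row-reduce the augmented matrix [A | I] to [I | A^{-1}]."""
        n = A.shape[0]
        M = np.hstack([A.astype(float), np.eye(n)])
        for col in range(n):
            pivot = col + np.argmax(np.abs(M[col:, col]))   # swap: partial pivoting
            M[[col, pivot]] = M[[pivot, col]]
            M[col] /= M[col, col]                           # scale the pivot row to 1
            for row in range(n):
                if row != col:
                    M[row] -= M[row, col] * M[col]          # pivot: clear the column
        return M[:, n:]

    A = np.array([[4.0, 7.0],
                  [2.0, 6.0]])
    print(np.allclose(gauss_jordan_inverse(A), np.linalg.inv(A)))   # True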

12.2. Addition

12.2.1. Matrix Addition

  • Entrywise Sum

12.2.2. Direct Sum

  • The direct sum of \( m\times n \) matrix \( A \) and \( p\times q \) matrix \( B \) is a \( (m+p)\times (n+q) \) matrix: \[ A\oplus B = \begin{bmatrix} A & 0 \\ 0 & B \end{bmatrix}. \]

12.2.3. Kronecker Sum

  • The Kronecker sum of \( n\times n \) matrix \( A \) and \( m\times m \) matrix \( B \) is: \[ A\oplus B = A\otimes I_m + I_n \otimes B \] where \(\otimes\) is the Kronecker product.

12.3. Multiplication

12.3.1. Matrix Multiplication

12.3.2. Kronecker Product

  • Matrix Direct Product
  • \(\otimes\)
  • Special tensor product with a standard choice of basis.
  • If \(A\) is an \(m\times n\) matrix and \(B\) is a \(p\times q\) matrix, then the Kronecker product is a \(pm\times qn\) matrix with (using zero-based indices): \[ (A\otimes B)_{pr+v,qs+w} = a_{rs}b_{vw}. \]
  • It produces a block matrix
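
The index formula above can be verified directly with numpy's kron (zero-based indices; the matrices are small random examples):

    import numpy as np

    rng = np.random.default_rng(2)
    m, n, p, q = 2, 3, 2, 2
    A = rng.integers(0, 5, size=(m, n))
    B = rng.integers(0, 5, size=(p, q))
    K = np.kron(A, B)                    # (p m) x (q n) block matrix

    # (A (x) B)[p r + v, q s + w] == A[r, s] * B[v, w]
    ok = all(K[p*r + v, q*s + w] == A[r, s] * B[v, w]
             for r in range(m) for s in range(n)
             for v in range(p) for w in range(q))
    print(ok)                            # True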

12.3.3. Hadamard Product

  • Element-wise Product, Entrywise Product, Schur Product
  • \(\odot\), \(\circ\)
  • For two matrices \(A\) and \(B\) of the same dimension, the Hadamard product is a matrix of the same dimension with: \[ (A\odot B)_{ij} = a_{ij}b_{ij}. \]

12.4. Transpose

12.5. Adjugate

The transpose of the cofactor matrix: \( C^{\top} \).

13. Spectral Theorem

  • "Spectral" means that it is about eigenvalues and eigenvectors.

13.1. Statement

13.1.1. Hermitian Matrix

  • If \(A\) is a Hermitian matrix, then all eigenvalues of \(A\) are real, and its eigenvectors can be chosen to form an orthonormal basis that spans the entire vector space.

13.1.2. Normal Matrix

  • \(A\) is normal if and only if there exists a unitary matrix \(U\) such that: \[ A = UDU^\dagger \] where \(D\) is a diagonal matrix.
  • This is a special case of the Schur decomposition.

13.2. Proof

  • Oxford Linear Algebra: Spectral Theorem Proof - YouTube
  • For every symmetric matrix \(A\in M_{n\times n}(\mathbb{R})\), there exists an orthogonal matrix \(P\) constructed from one of the eigenvectors \(\mathbf{v}_i\) of \(A\) together with a complementary orthonormal basis, whose existence is guaranteed by the Gram-Schmidt process.
  • \(P^{-1}AP\) is symmetric—therefore, \(C\) is symmetric—and represented by the block matrix: \[ P^{-1}AP = \begin{bmatrix} \lambda_i & 0 \\ 0 & C \end{bmatrix}. \]

13.2.1. Proof by Induction

  • Assume that for a symmetric matrix \(C\in M_{(n-1)\times (n-1)}(\mathbb{R})\), there exists an orthogonal matrix \(Q\) such that \(Q^{-1}CQ\) is a diagonal matrix.
  • Then for a symmetric matrix \(A\in M_{n\times n}(\mathbb{R})\), there exists: \[ R = P\begin{bmatrix}1 & 0 \\ 0 & Q\end{bmatrix} \] where \(P\) is the orthogonal matrix constructed from one of the eigenvectors.
  • This transforms \(A\) into a diagonal matrix: \[ R^{-1}AR = \begin{bmatrix} 1 & 0 \\ 0 & Q^{-1}\end{bmatrix}\begin{bmatrix} \lambda_i & 0 \\ 0 & C\end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & Q\end{bmatrix} = \begin{bmatrix} \lambda_i & 0 \\ 0 & Q^{-1}CQ \end{bmatrix}. \]

14. Characteristic Polynomial

  • Secular Equation

14.1. Definition

  • For a matrix \( A \) :

\[ p_A(x) := \det( xI -A) \]

15. Block Matrix

  • Partitioned Matrix

15.1. Definition

  • A matrix interpreted as having been broken into blocks or submatrices, which may be visualized by a collection of horizontal and vertical lines.

15.2. Determinant

15.2.1. Block Triangular Matrix

  • \[ \det\begin{bmatrix}A & 0 \\ C & D \end{bmatrix} = \det(A)\det(D) = \det\begin{bmatrix} A & B \\ 0 & D \end{bmatrix} \]
  • If \(A\) is invertible: \[ \det(M) = \det\begin{bmatrix} A & B \\ C & D \end{bmatrix} = \det(A) \det(M/A). \]
  • If \(D\) is invertible: \[ \det(M) = \det\begin{bmatrix} A & B \\ C & D \end{bmatrix} = \det(M/D) \det(D). \]
  • If the blocks are square matrices of the same size:
    • If \(C\) and \(D\) commute:
      • \[ \det\begin{bmatrix} A & B \\ C & D \end{bmatrix} = \det(AD-BC). \]

15.3. Inversion

  • It can be derived from the inverse of the block LDU decomposition.
  • If \(A\) is invertible:

    \begin{align*} M^{-1} = \begin{bmatrix} A & B \\ C & D \end{bmatrix}^{-1} &= \begin{bmatrix} I_p & A^{-1}B \\ 0 & I_q \end{bmatrix}^{-1}\begin{bmatrix} A & 0 \\ 0 & M/A \end{bmatrix}^{-1}\begin{bmatrix} I_p & 0 \\ CA^{-1} & I_q \end{bmatrix}^{-1} \\[11pt] &= \begin{bmatrix} I_p & -A^{-1}B \\ 0 & I_q \end{bmatrix}\begin{bmatrix} A^{-1} & 0 \\ 0 & (M/A)^{-1} \end{bmatrix}\begin{bmatrix} I_p & 0 \\ -CA^{-1} & I_q \end{bmatrix} \\[11pt] &= \begin{bmatrix} A^{-1} + A^{-1}B(M/A)^{-1}CA^{-1} & -A^{-1}B(M/A)^{-1} \\ -(M/A)^{-1}CA^{-1} & (M/A)^{-1} \end{bmatrix}. \end{align*}
  • Equivalently, if \(D\) is invertible:

    \begin{align*} M^{-1} = \begin{bmatrix} A & B \\ C & D \end{bmatrix}^{-1} &= \begin{bmatrix} I_p & 0 \\ D^{-1}C & I_q \end{bmatrix}^{-1}\begin{bmatrix} M/D & 0 \\ 0 & D \end{bmatrix}^{-1}\begin{bmatrix} I_p & BD^{-1} \\ 0 & I_q \end{bmatrix}^{-1} \\[11pt] &= \begin{bmatrix} I_p & 0 \\ -D^{-1}C & I_q \end{bmatrix}\begin{bmatrix} (M/D)^{-1} & 0 \\ 0 & D^{-1} \end{bmatrix}\begin{bmatrix} I_p & -BD^{-1} \\ 0 & I_q \end{bmatrix} \\[11pt] &= \begin{bmatrix} (M/D)^{-1} & -(M/D)^{-1}BD^{-1} \\ -D^{-1}C(M/D)^{-1} & D^{-1} + D^{-1}C(M/D)^{-1}BD^{-1} \end{bmatrix}. \end{align*}
  • If \(A\) and \(D\) are both invertible: \[ M^{-1} = \begin{bmatrix} A & B \\ C & D \end{bmatrix}^{-1} = \begin{bmatrix} (M/D)^{-1} & 0 \\ 0 & (M/A)^{-1}\end{bmatrix} \begin{bmatrix} I_p & -BD^{-1} \\ -CA^{-1} & I_q \end{bmatrix}. \]

15.3.1. Weinstein-Aronszajn Identity

  • Given the block decompositions above, one of the two Schur complements in the block-diagonal factor is invertible exactly when the other is.
  • In determinant form, for \(A\in K^{m\times n}\) and \(B\in K^{n\times m}\): \[ \det(I_m + AB) = \det(I_n + BA), \] so \(I_m + AB\) is invertible exactly when \(I_n + BA\) is.

15.4. Schur Complement

15.4.1. Definition

  • A \((p+q) \times (p+q)\) matrix \(M\) is a block matrix such that: \[ M = \begin{bmatrix} A_{p\times p} & B_{p\times q} \\ C_{q\times p} & D_{q\times q} \end{bmatrix}. \]
  • If \(D\) is invertible, then the Schur complement of the block \(D\) of the matrix \(M\) is the \(p\times p\) matrix: \[ M/D = A-BD^{-1}C. \]
  • If \(A\) is invertible, then the Schur complement of the block \(A\) of the matrix \(M\) is the \(q\times q\) matrix: \[ M/A = D - CA^{-1}B. \]

15.4.2. Block Gaussian Elimination

  • If \(D\) is invertible:

    \begin{bmatrix} A & B \\ C & D \end{bmatrix} \rightsquigarrow \begin{bmatrix} A & B \\ C & D \end{bmatrix}\begin{bmatrix} I_p & 0 \\ -D^{-1}C & I_q \end{bmatrix} = \begin{bmatrix} M/D & B \\ 0 & D \end{bmatrix}
  • If \(A\) is invertible:

    \begin{bmatrix} A & B \\ C & D \end{bmatrix} \rightsquigarrow \begin{bmatrix} A & B \\ C & D \end{bmatrix}\begin{bmatrix} I_p & -A^{-1}B \\ 0 & I_q \end{bmatrix} = \begin{bmatrix} A & 0 \\ C & M/A \end{bmatrix}
15.4.2.1. Block Gauss-Jordan Elimination
  • If \(D\) is invertible:

    \begin{bmatrix} M/D & B \\ 0 & D \end{bmatrix} \rightsquigarrow \begin{bmatrix} I_p & -BD^{-1} \\ 0 & I_q \end{bmatrix}\begin{bmatrix} M/D & B \\ 0 & D \end{bmatrix} = \begin{bmatrix} M/D & 0 \\ 0 & D \end{bmatrix}
  • If \(A\) is invertible:

    \begin{bmatrix} A & 0 \\ C & M/A \end{bmatrix} \rightsquigarrow \begin{bmatrix} I_p & 0 \\ -CA^{-1} & I_q \end{bmatrix}\begin{bmatrix} A & 0 \\ C & M/A \end{bmatrix} = \begin{bmatrix} A & 0 \\ 0 & M/A \end{bmatrix}
  • This can be done after the block Gaussian elimination.
15.4.2.2. Block LDU Decomposition
  • If \(D\) is invertible:

    \begin{bmatrix} A & B \\ C & D \end{bmatrix} = \begin{bmatrix} I_p & BD^{-1} \\ 0 & I_q \end{bmatrix}\begin{bmatrix} M/D & 0 \\ 0 & D \end{bmatrix}\begin{bmatrix} I_p & 0 \\ D^{-1}C & I_q \end{bmatrix}
  • If \(A\) is invertible:

    \begin{bmatrix} A & B \\ C & D \end{bmatrix} = \begin{bmatrix} I_p & 0 \\ CA^{-1} & I_q \end{bmatrix}\begin{bmatrix} A & 0 \\ 0 & M/A \end{bmatrix}\begin{bmatrix} I_p & A^{-1}B \\ 0 & I_q \end{bmatrix}
  • This is the result of combining the block Gaussian elimination and the block Gauss-Jordan elimination.

15.4.3. Solving Block Linear Equations

  • From a system of linear equations with invertible submatrix \(A\):

    \begin{bmatrix} A & B \\ C & D \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} u \\ v \end{bmatrix}
  • Since \(x = A^{-1}(u-By)\) from the first equation, the second equation reduces to: \[ (D-CA^{-1}B)y = v-CA^{-1}u. \]
    • Alternatively, this can be thought of as applying the inverse of the first matrix in the block LDU decomposition on both sides.
  • Then \(y = (M/A)^{-1}(v-CA^{-1}u)\).
  • Substituting this into the first equation yields: \[ x=(A^{-1} + A^{-1}B(M/A)^{-1}CA^{-1})u - A^{-1}B(M/A)^{-1}v. \]
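
A small numpy sketch of this two-stage solve (block sizes and data are arbitrary), checked against a direct solve of the assembled system:

    import numpy as np

    rng = np.random.default_rng(3)
    p, q = 3, 2
    A = rng.standard_normal((p, p)) + p * np.eye(p)   # keep A comfortably invertible
    B = rng.standard_normal((p, q))
    C = rng.standard_normal((q, p))
    D = rng.standard_normal((q, q))
    u = rng.standard_normal(p)
    v = rng.standard_normal(q)

    S = D - C @ np.linalg.solve(A, B)                 # Schur complement M/A
    y = np.linalg.solve(S, v - C @ np.linalg.solve(A, u))
    x = np.linalg.solve(A, u - B @ y)

    M = np.block([[A, B], [C, D]])
    print(np.allclose(np.concatenate([x, y]),
                      np.linalg.solve(M, np.concatenate([u, v]))))   # True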

16. Vectorization

16.1. Definition

  • It is the isomorphism between \[ \mathbb{R}^{m\times n} := \mathbb{R}^m\otimes \mathbb{R}^n \cong \mathbb{R}^{mn}. \]
  • It stacks the column vectors.

16.2. Half-Vectorization

  • The vectorization of only the lower (or, depending on convention, upper) triangular portion of a matrix, typically a symmetric one.

17. Matrix Decomposition

17.1. LU Decomposition

  • Factorize a square matrix \(A\) into the product of a lower triangular matrix \(L\) and an upper triangular matrix \(U\).
  • Some matrices require row exchanges first; with a permutation matrix \(P\) accounting for the row exchanges, \(PA = LU\).

17.1.1. Procedure

  • Start with \(IA\), where \(I\) is the identity matrix.
  • Starting from the first column of \(I\), fill each entry \(i_{n1}\) such that \(a_{11}i_{n1}=a_{n1}\), and perform the row operation \(\mathrm{row}_n\rightsquigarrow\mathrm{row}_n-i_{n1}\mathrm{row}_1\) on \(A\).
  • Repeat the process for the subsequent columns.
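
A minimal numpy sketch of this procedure without row exchanges (it assumes no zero pivot is encountered; the matrix is an arbitrary example):

    import numpy as np

    def lu_no_pivoting(A):
        """Doolittle factorization A = L U, assuming all pivots are nonzero."""
        n = A.shape[0]
        U = A.astype(float)
        L = np.eye(n)
        for col in range(n):
            for row in range(col + 1, n):
                L[row, col] = U[row, col] / U[col, col]   # multiplier l_{row,col}
                U[row] -= L[row, col] * U[col]            # row_row <- row_row - l * row_col
        return L, U

    A = np.array([[2.0, 3.0, 1.0],
                  [4.0, 7.0, 7.0],
                  [6.0, 18.0, 22.0]])
    L, U = lu_no_pivoting(A)
    print(np.allclose(L @ U, A),
          np.allclose(L, np.tril(L)),
          np.allclose(U, np.triu(U)))   # True True True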

17.2. QR Decomposition

QR Factorization, QU Factorization

  • For an invertible matrix \(A\): \[ A = QR \] where \(Q\) is an orthogonal matrix and \(R\) is an upper triangular matrix.

17.2.1. Computation

  • Via the Gram-Schmidt process, Householder transformations, or Givens rotations.

17.2.1.1. Gram-Schmidt Process
  • Obtain the orthonormal vectors from the column vectors via Gram-Schmidt, which become the column vectors of \(Q\).
  • Then express each column vector of \(A\) using its components with respect to the orthonormal vectors, which become the entries of the upper triangular matrix \(R\).
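
A minimal numpy sketch of this classical Gram-Schmidt construction (for an invertible matrix as stated above; numerically less robust than the Householder-based np.linalg.qr):

    import numpy as np

    def gram_schmidt_qr(A):
        """Orthonormalize the columns of A into Q; record the components in R."""
        m, n = A.shape
        Q = np.zeros((m, n))
        R = np.zeros((n, n))
        for j in range(n):
            v = A[:, j].astype(float)
            for i in range(j):
                R[i, j] = Q[:, i] @ A[:, j]   # component of a_j along q_i
                v -= R[i, j] * Q[:, i]
            R[j, j] = np.linalg.norm(v)
            Q[:, j] = v / R[j, j]
        return Q, R

    A = np.array([[1.0, 1.0, 0.0],
                  [1.0, 0.0, 1.0],
                  [0.0, 1.0, 1.0]])
    Q, R = gram_schmidt_qr(A)
    print(np.allclose(Q @ R, A),
          np.allclose(Q.T @ Q, np.eye(3)),
          np.allclose(R, np.triu(R)))   # True True True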

17.3. Spectral Decomposition

Eigendecomposition, Diagonalization

  • For a real symmetric matrix \(S\): \[ S = P\Lambda P^{-1} = P\Lambda P^{\top} \] where \(P\) is an orthogonal matrix, and \(\Lambda\) is a diagonal matrix.

  • Equivalently, \( S = \sum_j \lambda_jP_j \) where \(P_j\) are the projectors (projection matrices).
  • \( P \) is called the modal matrix of \( S \), and \( \Lambda \) is called the spectral matrix of \( S \).
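
A short numpy sketch of the decomposition and its projector form (the symmetric matrix is randomly generated):

    import numpy as np

    rng = np.random.default_rng(4)
    M = rng.standard_normal((4, 4))
    S = M + M.T                                    # symmetric by construction

    eigvals, P = np.linalg.eigh(S)                 # orthogonal P, real eigenvalues
    print(np.allclose(P @ np.diag(eigvals) @ P.T, S))   # S = P Lambda P^T

    # Equivalently, a weighted sum of rank-one projectors
    S_sum = sum(lam * np.outer(P[:, j], P[:, j]) for j, lam in enumerate(eigvals))
    print(np.allclose(S_sum, S))                   # True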

17.4. Schur Decomposition

  • Schur Triangulation
  • Any \(A\in M_{n\times n}(\mathbb{C})\) can be expressed as: \[ A = QUQ^{-1} \] where \(Q\) is a unitary matrix, and \(U\) is an upper triangular matrix called the Schur form of \(A\).

17.5. Jordan Normal Form

  • Jordan Canonical Form(JCF)

17.5.1. Definition

  • It is the representation of a linear operator on a finite-dimensional vector space by a Jordan matrix, with respect to a suitable basis.

17.5.2. Jordan Matrix

17.5.2.1. Definition
  • A block diagonal matrix, where each block along the diagonal is a Jordan block: its diagonal entries are all equal to the same value, its superdiagonal entries are ones, and it is zero elsewhere.
17.5.2.2. Properties
  • A Jordan matrix \(J\) is the direct sum of Jordan blocks \(J_{\lambda_i, m_i}\): \[ J = \bigoplus_{i=1}^n J_{\lambda_i,m_i}. \]

17.5.3. Properties

  • It is used to analyze the non-diagonalizable matrices—defective matrices.
    • For a defective matrix \(A\) there exists an invertible matrix \(P\) such that: \[ J = P^{-1}AP. \]
  • 28. Similar Matrices and Jordan Form - YouTube

    Every square matrix \(A\) is similar to a Jordan matrix \(J\).

    • \( P \) is called the generalized modal matrix.
  • Each Jordan block is of the form \(\lambda_i I + N\), where \(N\) is a nilpotent matrix with superdiagonal entries equal to 1; the number of such blocks per eigenvalue is counted in Section 17.5.5.
  • The Jordan normal form provides a canonical representative of each equivalence class of matrices under the equivalence relation of matrix similarity, extending beyond the family of diagonalizable matrices.

17.5.4. Multiplicities

17.5.4.1. Algebraic Multiplicity
  • The multiplicity in the characteristic polynomial.
  • It is the sum of the sizes of all Jordan blocks corresponding to the eigenvalue \(\lambda_i\).
17.5.4.2. Geometric Multiplicity
  • \(\dim\big( \ker(A-\lambda_i I)\big)\).
  • It is the number of Jordan blocks corresponding to the eigenvalue \(\lambda_i\).

17.5.5. Calculation

  • Jordan Normal Form 2 | An Example - YouTube
  • Calculate the algebraic multiplicity and the geometric multiplicity to determine the Jordan blocks.
  • If indeterminate, calculate further:
    • The number of Jordan blocks corresponding to \(\lambda_i\) of size at least \(j\) is: \[ \dim\ker\left((A-\lambda_iI)^j\right) - \dim\ker\left((A-\lambda_i I)^{j-1}\right). \]
    • This is because a Jordan block is of the form \(\lambda_i I + N\), where \(N\) has index of nilpotency equal to the block size (the number of missing eigenvectors plus one).
    • Subtracting the diagonal matrix \(\lambda_i I\) leaves only the nilpotent matrix \(N\), and raising it to a power increments the dimension of the kernel only if the corresponding (exponent - 1) missing eigenvectors exist.
    • Note:

      \begin{align*} \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}^2 & = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}\\ \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}^3 &= \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}. \end{align*}
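
A small numpy sketch of this block-counting formula on a matrix with a hidden but known Jordan structure (eigenvalue 5 with blocks of sizes 2 and 1; the similarity transform is random):

    import numpy as np

    def nullity(M, tol=1e-9):
        return M.shape[0] - np.linalg.matrix_rank(M, tol=tol)

    J = np.array([[5.0, 1.0, 0.0],
                  [0.0, 5.0, 0.0],
                  [0.0, 0.0, 5.0]])
    rng = np.random.default_rng(5)
    P = rng.standard_normal((3, 3))
    A = P @ J @ np.linalg.inv(P)

    N = A - 5.0 * np.eye(3)
    for j in (1, 2, 3):
        count = (nullity(np.linalg.matrix_power(N, j))
                 - nullity(np.linalg.matrix_power(N, j - 1)))
        print(j, count)   # size >= 1: 2 blocks, size >= 2: 1 block, size >= 3: 0 blocks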

17.6. Jordan-Chevalley Decomposition

The relation between the minimal polynomial of the matrix and the decomposability.

17.7. Canonical Forms

17.7.1. Frobenius Normal Form

17.7.2. Weyr Normal Form

17.8. Singular Value Decomposition

SVD

  • Any matrix can be decomposed into three matrices: \(U, \Sigma, V\), where \(U\) and \(V\) are orthogonal, and \(\Sigma\) is rectangular and diagonal.

17.8.1. Definition

  • Decomposition of any rectangular matrix \(A\) into the form: \[ A = U\Sigma V^\top \]
    • where \(U\) and \(V\) are orthogonal matrices and \(\Sigma\) is a rectangular diagonal matrix with nonnegative entries.

17.8.2. Procedure

  • Notice that \(AA^{\rm T}\) and \(A^{\rm T}A\) are each a symmetric square matrix.
  • Eigenvectors of \(AA^{\rm T}\) are the left singular vectors of \(A\), and eigenvectors of \(A^{\rm T}A\) are the right singular vectors. The nonzero eigenvalues of the two matrices coincide, while the leftover eigenvalues are zero.
  • \(AA^{\rm T}\) and \(A^{\rm T}A\) are positive semidefinite, which means those eigenvalues are all greater than or equal to zero.
  • Square root of those eigenvalues are the singular values.
  • The columns of \(U\) are the left singular vectors in descending order, the diagonal entries of \(\Sigma\) are the singular values in descending order, and the columns of \(V\) are the right singular vectors in descending order.
  • \(A = U\Sigma V^{\rm T}\).
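
A quick numpy check of the procedure above (random test matrix): the decomposition reconstructs \(A\), and the squared singular values are the eigenvalues of \(A^{\rm T}A\).

    import numpy as np

    rng = np.random.default_rng(6)
    A = rng.standard_normal((4, 3))

    U, s, Vt = np.linalg.svd(A, full_matrices=False)   # thin SVD
    print(np.allclose(U @ np.diag(s) @ Vt, A))          # A = U Sigma V^T

    eig_AtA = np.sort(np.linalg.eigvalsh(A.T @ A))[::-1]
    print(np.allclose(eig_AtA, s**2))                   # True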

17.8.3. Properties

  • Each singular value, multiplied by the corresponding left and right singular vectors, contributes an individual rank-one term: \(A = \sum_i \sigma_i \mathbf{u}_i\mathbf{v}_i^{\rm T}\).
  • Singular value decomposition of the covariance matrix gives the principal components.
  • The squared singular values are the eigenvalues of both the left and right symmetric matrices:
    • \[ AA^{\rm T} = U\Sigma \Sigma^{\rm T}U^{\rm T} \]
    • \[ A^{\rm T}A = V\Sigma^{\rm T}\Sigma V^{\rm T} \]

17.8.4. Singular Value

  • \(i\)th largest singular value is often denoted as \(\sigma_i(A)\).

17.8.5. Reduced SVD

17.8.5.1. Thin SVD
  • The number of columns of \(U\) and the number of columns of \(V\) (i.e. rows of \(V^{\top}\)) are reduced to \(k := \min(m,n)\).
17.8.5.2. Compact SVD
  • The number is reduced to only include the nonzero singular values.
17.8.5.3. Truncated SVD
  • Truncate even the nonzero singular values. It is no longer the decomposition of the original matrix, but rather the optimal low-rank matrix approximation.

17.9. Cholesky Decomposition

17.9.1. Definition

A Hermitian positive-definite matrix \( A \) is decomposed in terms of a lower triangular matrix \( L \):

\begin{equation*} A = LL^*. \end{equation*}
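
A minimal numpy check of the factorization (the positive-definite matrix is constructed for the example):

    import numpy as np

    rng = np.random.default_rng(7)
    M = rng.standard_normal((3, 3))
    A = M @ M.T + 3 * np.eye(3)        # symmetric positive-definite by construction

    L = np.linalg.cholesky(A)          # lower triangular factor
    print(np.allclose(L @ L.T, A),
          np.allclose(L, np.tril(L)))  # True True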

17.9.2. LDL Decomposition

The LDL decomposition requires the diagonal entries of \( L \) to be 1, and introduces the diagonal matrix \( D \): \( A = LDL^*. \)

17.9.3. Intuition

A Hermitian positive-definite matrix \( A \) corresponds to an ellipsoid \( x^\top Ax = 1 \). The vectors \( \{ v_i \} \) for conjugate axes can be chosen sequentially to make an upper triangular matrix with the vectors being the column vectors. By the definition of conjugate axes

\begin{equation*} V^{\top}AV = I \end{equation*}

and the \( L \) can now be chosen to be \( L = (V^{-1})^{\top} \).

Similarly, the principal axes can be used to define \( V \): the orthogonal set of vectors can be normalized, yielding an orthogonal matrix \( U \) and a diagonal matrix \( \Sigma^{-1/2} \) whose entries are the norms of the vectors:

\begin{equation*} V = U\Sigma^{-1/2}. \end{equation*}

\( A \) is now decomposed into:

\begin{equation*} A = U\Sigma U^{\top}. \end{equation*}

17.10. Polar Decomposition

17.10.1. Definition

A square real or complex matrix \( A \) can be decomposed into \[ A = UP \] where \( U \) is a unitary matrix and \( P \) is a positive semi-definite Hermitian matrix.
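
The factors can be computed from the singular value decomposition; a small numpy sketch for a random real square matrix:

    import numpy as np

    rng = np.random.default_rng(8)
    A = rng.standard_normal((3, 3))

    W, s, Vt = np.linalg.svd(A)
    U = W @ Vt                          # orthogonal (unitary) factor
    P = Vt.T @ np.diag(s) @ Vt          # positive semi-definite symmetric factor

    print(np.allclose(U @ P, A),
          np.allclose(U.T @ U, np.eye(3)),
          np.all(np.linalg.eigvalsh(P) >= -1e-12))   # True True True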

17.10.2. Properties

  • If \( A \) is invertible the decomposition is unique, with \( P \) being positive-definite.
    • And \( A \) can be uniquely written as \( Ue^X \) where \( X \) is the unique self-adjoint logarithm of the matrix \( P \).

17.11. Simultaneous Diagonalization by Congruence

A symmetric matrix \( A \) can always be decomposed into sum of simple symmetric matrices:

\begin{equation*} A = \sum_i a_i b_ib_i^{\top}. \end{equation*}

Can a family of \( n \) by \( n \) symmetric matrices \( \{ A_i \}_{i=1}^n \) always be simultaneously decomposed in this simple way?

18. Matrix Relations

18.1. Row Equivalence

  • ~

18.1.1. Definition

  • Two matrices are row equivalent if one can be changed to the other by a sequence of elementary row operations.
  • Equivalently, they have the same row space.

18.1.2. Properties

  • Reversible, therefore it is an equivalence relation.

18.1.3. Elementary Row Operations

  • Swap: Swap two rows.
  • Scale: Multiply a row by a nonzero constant.
  • Pivot: Add a multiple of one row to another row.

18.2. Matrix Similarity

18.2.1. Definition

  • Two \(n\times n\) matrices \(A\) and \(B\) are similar, if there exists an invertible matrix \(M\) such that: \[ M^{-1}AM = B. \]

18.2.2. Properties

18.3. Matrix Congruence

18.3.1. Definition

  • Two square matrices \(A\) and \(B\) are congruent if: \[ P^{\rm T}AP = B \] where \(P\) is an invertible matrix.

18.3.2. Properties

18.4. Matrix Equivalence

18.4.1. Definition

  • Two rectangular \(m\times n\) matrices \(A\) and \(B\) are equivalent if: \[ B = Q^{-1}AP \] where \(Q\) is an invertible \(m\times m\) matrix (Section 5.5), and \(P\) is an invertible \(n\times n\) matrix.

18.4.2. Properties

  • Two equivalent matrices represent the same linear transformation \(V\to W\) with respect to two different pairs of bases of \(V\) and \(W\).

19. Matrix Norms

19.1. Definition

  • A matrix norm \( \|\cdot\|\colon K^{m\times n}\to \mathbb{R} \) must satisfy the usual vector-norm axioms: positive definiteness, absolute homogeneity, and the triangle inequality,
  • and additionally (often required in the square case),
    • Sub-multiplicativity: \( \|AB\| \le \|A\|\,\|B\| \)

19.2. Induced Operator Norm

19.2.1. From Vector p-Norm

  • See p-norm
  • For a matrix \(A\colon K^n\to K^m\), \[ \|A\|_p := \sup_{x\neq 0}\frac{\| Ax\|_p}{\|x\|_p}. \]
19.2.1.1. Properties
  • \[ \|A\|_1 = \max_{1\le j\le n}\sum_{i=1}^m|a_{ij}| \]
  • Spectral Norm
    • \[ \|A\|_2 = \sqrt{\lambda_\text{max}(A^*A)} = \sigma_\text{max}(A) \]
  • \[ \|A\|_\infty = \max_{1\le i\le m}\sum_{j=1}^n|a_{ij}| \]
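
These three formulas can be checked directly against numpy's matrix norms (the matrix is an arbitrary example):

    import numpy as np

    A = np.array([[ 1.0, -2.0,  3.0],
                  [-4.0,  5.0, -6.0]])

    print(np.isclose(np.linalg.norm(A, 1), np.abs(A).sum(axis=0).max()))        # max column sum
    print(np.isclose(np.linalg.norm(A, np.inf), np.abs(A).sum(axis=1).max()))   # max row sum
    print(np.isclose(np.linalg.norm(A, 2),
                     np.linalg.svd(A, compute_uv=False)[0]))                    # largest singular value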

19.2.2. From Vector α- and β-Norm

  • \[ \|A\|_{\alpha,\beta} := \sup_{x\neq 0}\frac{\|Ax\|_\beta}{\|x\|_\alpha} \]
19.2.2.1. Properties
  • \[ \|A\|_{2,\infty} = \max_{1\le i\le m}\|A_{i:}\|_2 \]
  • \[ \|A\|_{1,2} = \max_{1\le j\le n}\|A_{:j}\|_2 \]
  • It is consistent with vector norms by definition, and subsequently \[ \|AB\|_{\alpha,\gamma} \le \|A\|_{\beta, \gamma}\|B\|_{\alpha, \beta}. \]

19.3. Entry-wise Norm

19.3.1. Lp,q Norm

  • \[ \| A \|_{p,q} := \left(\sum_{j=1}^n \left( \sum_{i=1}^m |a_{ij}|^p \right)^{\frac{q}{p}}\right)^{\frac{1}{q}}. \]
  • \[ \|A\|_{p,p} = \|\mathrm{vec}(A)\|_p = \left( \sum_{i=1}^m \sum_{j=1}^n |a_{ij}|^p \right)^{1/p} \]
19.3.1.1. Properties

19.3.2. Max Norm

  • \[ \|A\|_\text{max} := \max_{i,j}|a_{ij}| \]
19.3.2.1. Properties
  • This norm is not sub-multiplicative, but the rescaled norm \(\sqrt{mn}\,\|A\|_\text{max}\) is.

19.4. Schatten Norm

  • Schatten p-Norm
  • p-norm of the singular values.
  • \[ \|A\|_p := \left( \sum_{i=1}^{\min\{m,n\}} \sigma_i(A)^p \right)^{1/p}. \]

19.4.1. Properties

  • \(p=1\) yields the Ky-Fan \(n\)-norm, or the nuclear norm \[ \|A\|_1 = \|A\|_* = \mathrm{tr}(\sqrt{A^*A}) = \sum_{i=1}^{\min\{m,n\}} \sigma_i(A) \]
    • since \(A^*A\) is a positive semidefinite matrix, the \(\sqrt{A^*A}\) is well-defined.
    • The nuclear norm \(\|A\|_*\) is a convex envelope of the rank function \(\mathrm{rank}(A)\), so it is often used in mathematical optimization to search for low-rank matrices.
    • Von Neumann's trace inequality and Hölder's inequality for Euclidean space yields:
      • For \(1/p + 1/q = 1\), \( |\mathrm{tr}(A^*B)| \le \|A\|_p\|B\|_q. \)
      • In particular, \[ \|A\|_\text{F}^2 \le \|A\|_p \|A\|_q. \]
  • \(p=2\) yields the Frobenius norm, \(p=\infty\) yields the spectral norm.
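
A quick numpy check of these special cases via the singular values (random test matrix):

    import numpy as np

    rng = np.random.default_rng(9)
    A = rng.standard_normal((4, 3))
    s = np.linalg.svd(A, compute_uv=False)

    print(np.isclose(s.sum(), np.linalg.norm(A, 'nuc')))                 # p = 1: nuclear norm
    print(np.isclose(np.sqrt((s**2).sum()), np.linalg.norm(A, 'fro')))   # p = 2: Frobenius norm
    print(np.isclose(s.max(), np.linalg.norm(A, 2)))                     # p = inf: spectral norm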

19.5. Ky Fan Norm

  • Ky Fan k-Norm
  • \[ \|A\|_k := \sum_{i=1}^k \sigma_i(A) \]

19.5.1. Properties

  • \(k=1\) yields the spectral norm.
  • \(k=n\) yields the nuclear norm.

19.6. Frobenius Inner Product

  • Hilbert-Schmidt Inner Product

19.6.1. Definition

  • For complex matrices \(\mathbf{A}\) and \(\mathbf{B}\) \[ \langle \mathbf{A}, \mathbf{B}\rangle_{\rm F} = \operatorname{tr}\left(\mathbf{A}^{\dagger}\mathbf{B}\right). \]
  • It is the sum of the products of corresponding entries (with complex conjugation applied to \(\mathbf{A}\)).

19.6.2. Frobenius Norm

  • Hilbert-Schmidt Norm
  • \[ \|\mathbf{A}\|_{\rm F} = \sqrt{\langle \mathbf{A}, \mathbf{A}\rangle_{\rm F}} \]
19.6.2.1. Properties
  • \[ \| \mathbf{A}\|_{\rm F} = \sqrt{\sum\nolimits_i \sigma_i(\mathbf{A})^2} \]
  • where \(\sigma_i(\mathbf{A})\) is the \(i\)th singular value of \(\mathbf{A}\).

19.6.3. Properties

19.7. Cut Norm

  • It measures how close the associated graph is to being bipartite.
  • \[ \|A\|_{\Box} :=\max_{S\subseteq[n], T\subseteq[m]}{\left|\sum_{s\in S,t\in T}{A_{t,s}}\right|}. \]
  • It is equivalent to the induced operator norm \(\|\cdot\|_{\infty, 1}\),
  • and to the Grothendieck norm:
    • \[ \|A\|_{G,k}=\sup_{u_j, v_j\in K^k, \|u_j\| = \|v_j\| = 1}{\sum_{j \in [n], \ell \in [m]}{(u_j\cdot v_j) A_{\ell,j}}} \]
    • which is the norm of the extended operator \((K^k)^n \to (K^k)^m\).

19.8. Possible Properties

19.8.1. Consistency and Compatibility

  • A matrix norm \(\|\cdot\|\) on \(K^{m\times n}\) is consistent with the vector norms \(\|\cdot\|_\alpha\) on \(K^n\) and \(\|\cdot\|_\beta\) on \(K^m\) if:
    • \[ \forall A\in K^{m\times n}, \forall x\in K^n, \|Ax\|_\beta \le \|A\|\,\|x\|_\alpha \]
  • Further, in the special case \(m=n\) and \(\alpha = \beta\), \(\|\cdot\|\) is called compatible with \(\|\cdot\|_\alpha\).

19.8.2. Monotone Norm

  • A matrix norm is monotone if it is monotonic with respect to the Loewner order, that is: \[ A \preccurlyeq B \implies \|A\| \le \|B\|. \]

19.8.3. Equivalence

  • Two norms are equivalent if \[ \exists r, s > 0: \forall A \in K^{m\times n}, r\|A\|_\alpha \le \|A\|_\beta \le s\|A\|_\alpha. \]
  • All norms on \(K^{m\times n}\) are equivalent. They induce the same topology.
  • Moreover, for every matrix norm on \(\mathbb{R}^{n\times n}\), there exists a unique positive real number \(k\) such that \(\ell\|\cdot \|\) is a sub-multiplicative matrix norm for every \(\ell \ge k\).
    • to wit, \(k = \sup\{\|AB\| : \|A\| < 1, \|B\| \le 1\}.\)
  • A sub-multiplicative matrix norm \(\|\cdot\|_\alpha\) is said to be minimal, if there exists no sub-multiplicative matrix norm \(\|\cdot\|_\beta\) satisfying \(\|\cdot \|_\beta < \|\cdot\|_\alpha\).

20. Matrix Order

20.1. Loewner Order

20.1.1. Definition

  • For two Hermitian matrices \(A, B\) of order \(n\), \[ A \ge B \iff A - B\text{ positive semidefinite} \]
  • and \[ A > B \iff A - B \text{ positive definite}. \]

21. Principal Square Root

21.1. Theorem

For a positive semidefinite symmetric matrix \( A \), there exists a unique positive semidefinite symmetric matrix \( B \) such that \( B^2 = A \).

21.2. Solving for Principal Square Root

  • Diagonalization
  • Schur Decomposition
  • Jordan Decomposition
  • Power Series
  • Denman-Beavers Iteration
  • Babylonian Method

22. Matrix Pencil

22.1. Definition

Matrix pencil is a matrix-valued function \( L\colon K \to \mathrm{Mat}(K, n\times n) \) for a field \( K \) defined by:

\begin{equation*} L(\lambda) = \sum_{i=0}^m \lambda^iA_i \end{equation*}

where \( A_i \in \mathrm{Mat}(K, n\times n) \).

22.2. Etymology

Pencil carries the meaning of a family of rays that converge to or diverge from a point.

23. References


Created: 2025-05-06 Tue 23:34